
Journal of Computational Chemistry

Wiley

All preprints, ranked by how well they match Journal of Computational Chemistry's content profile, based on 11 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Why Many Molecular Simulation Research Findings Might Be False: An Analysis of Inter-Simulations Differences Based on Simulation Time and Number of Replicas

Knapp, B.; Deane, C. M.

2022-08-25 bioinformatics 10.1101/2022.08.23.504950 medRxiv
Top 0.1%
7.1%

Molecular simulations are a common technique to investigate the dynamics of proteins, DNA and RNA. A typical application is the simulation of a wild-type structure and a mutant structure where the mutant has a significantly higher (or lower) potency to trigger a signalling cascade. The study would then analyse the observed differences between the wild-type and mutant simulations and link these to their differences in potency. However, differences in the simulations cannot always be reproduced by other research groups even if the same parameters as in the original simulations are used. This is caused by the rugged energy landscape of many biological structures, which means that minor differences in hardware or software can cause simulations to take different paths. This would not be a problem if the simulation time were infinitely long, but in real life the simulation time is always finite. In this study we use large-scale molecular simulations of four different systems (a 10-mer peptide wild-type and mutant as well as a T-cell receptor, peptide and MHC complex as wild-type and mutant) with 100 replicas each, totalling 620 000 ns, to quantify the magnitude of (non-)reproducibility when comparing inter-simulation differences (e.g. wild-type vs mutant). Using a bootstrapping approach we found that simulation times of at least 2 to 3 times the experimental folding time using a minimum of 3 replicas are necessary for reproducible results. However, for most complexes of interest such long simulation times are far out of reach, which means that it is only possible to sample the local phase space neighbourhood of the x-ray structure. To sample this neighbourhood reliably, around 10 to 20 replicas are needed.
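The bootstrapping idea in this abstract can be illustrated with a minimal sketch. It is not the authors' protocol: the observable, the synthetic Gaussian data, and the function name `sign_agreement` are all assumptions made for illustration. The sketch asks how often a small bootstrap subsample of replicas reproduces the sign of the wild-type vs mutant difference seen across all replicas.

```python
import random
from statistics import mean

def sign_agreement(wt, mut, n_replicas, n_boot=2000, seed=0):
    """Fraction of bootstrap subsamples (drawn with replacement) whose
    mutant-minus-wild-type mean difference has the same sign as the
    difference computed from all available replicas."""
    rng = random.Random(seed)
    full_diff = mean(mut) - mean(wt)
    agree = 0
    for _ in range(n_boot):
        wt_sub = rng.choices(wt, k=n_replicas)    # resample wild-type replicas
        mut_sub = rng.choices(mut, k=n_replicas)  # resample mutant replicas
        if (mean(mut_sub) - mean(wt_sub)) * full_diff > 0:
            agree += 1
    return agree / n_boot

# Hypothetical data: one scalar observable per replica, 100 replicas per system.
data_rng = random.Random(1)
wt_values = [data_rng.gauss(1.0, 0.5) for _ in range(100)]
mut_values = [data_rng.gauss(1.3, 0.5) for _ in range(100)]

for n in (1, 3, 10, 20):
    print(n, sign_agreement(wt_values, mut_values, n))
```

With noisy per-replica values, agreement is expected to rise as more replicas are drawn per subsample, which is the qualitative effect the abstract describes.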

2
Quantum analysis of protein-ligand binding by integrating structural resolution, sequence homology, and ligand properties

Roosan, D.; Samrose, S.; Khan, R.; Nirzhor, S.; Provencher, B.

2025-06-30 molecular biology 10.1101/2025.06.27.661905 medRxiv
Top 0.1%
6.5%

Predicting protein-ligand binding affinity is a fundamental challenge in computational biology and drug discovery, complicated by diverse factors including protein sequence variability, ligand chemical diversity, and structural resolution. Here, we present an integrative study that combines classical machine learning and quantum-enhanced modeling to investigate how crystal structure resolution, sequence similarity, and ligand properties jointly influence binding affinity. Using a curated "refined" dataset from PDBbind and an expanded general dataset, we first conduct correlation and regression analyses to quantify the relationships among binding affinity, ligand descriptors (e.g., molecular weight, logP), and protein structural metrics (resolution, R-factor). We observe moderate positive correlations between ligand size/hydrophobicity and affinity, and a slight negative correlation between resolution and affinity in the refined dataset that largely disappears in the general set. We then train multiple predictive models, including random forests, deep neural networks, and quantum-enhanced approaches (quantum kernel methods, variational quantum circuits, and a hybrid classical-quantum neural network). Experimental results show that quantum-enhanced models perform on par with classical methods in predicting binding affinities and, in some cases, offer modest improvements. Notably, a hybrid quantum-classical model achieves the highest accuracy (Pearson correlation R ≈ 0.80) on the refined dataset. These findings highlight the potential of quantum computing for capturing complex patterns in biomolecular data, laying groundwork for improved structure-based drug design. Our study underscores that while data quality and curation greatly influence observed trends, quantum machine learning, despite current hardware limitations, can already serve as a competitive and promising tool in computational structural biology.
Author summary: In our study, we confront a major bottleneck in the creation of new medicines: accurately predicting how strongly a potential drug will attach to its target protein in the body. Getting this right early in the process could save billions of dollars and years of research by preventing dead-ends. We investigated whether quantum computing, a new technology that processes information in a fundamentally different way, could provide a better solution. We trained several computer models to predict these binding strengths, comparing standard artificial intelligence (AI) with new models enhanced by quantum machine learning. Our results showed that a hybrid model, which strategically combines classical AI with a quantum-powered component, delivered the most accurate predictions on high-quality, curated data. This work demonstrates that quantum computing is already a competitive tool for real-world biological problems, not just a future possibility. We believe that by further developing these quantum-enhanced approaches, we can create more reliable predictive tools to make the long and difficult search for new drugs faster and more successful.

3
Differentiating Agonists and Competitive Antagonists of the Serotonin Type 3A (5-HT3A) Receptor

Davolio, A. J.; Jankowski, W. J.; Varnai, C.; Irwin, B. W. J.; Payne, M. C.; Chau, P. L.

2023-05-15 molecular biology 10.1101/2023.05.15.540789 medRxiv
Top 0.1%
6.4%

What makes an agonist and a competitive antagonist? In this work, we aim to answer this question by performing parallel tempering Monte Carlo simulations on the serotonin type 3A (5-HT3A) receptor. We use linear response theory to predict conformational changes in the 5-HT3A receptor active site after applying weak perturbations to its allosteric binding sites. A covariance tensor is built from conformational sampling of its apo state, and a harmonic approximation allows us to substitute the calculation of ligand-induced forces with the binding sites displacement vector. We show that it is possible to differentiate between agonists and competitive antagonists for multiple ligands while running computationally expensive calculations only once for the protein.

4
Reinforced molecular dynamics: Physics-infused generative machine learning model explores CRBN activation process

Kolossvary, I.; Coffey, R.

2025-02-17 molecular biology 10.1101/2025.02.12.638002 medRxiv
Top 0.1%
6.3%

We propose a simple and practical machine learning-based desktop solution for modeling biologically relevant protein motions. We term our technology reinforced molecular dynamics (rMD); it combines MD trajectory data and free-energy (FE) map data to train a dual-loss function autoencoder network that can explore conformational space more efficiently than the underlying MD simulation. The key insight of rMD is that it effectively replaces the latent space with an FE map, thus infusing the autoencoder network with a physical context. The FE map is computed from an MD simulation over a low-dimensional collective variable space that captures some biological function. One can then use the FE map directly, for example, to generate more protein structures in poorly sampled regions or to follow paths on the FE map to explore conformational transitions. The rMD technology is entirely self-contained, does not rely on any pre-trained model, and can be run on a single GPU desktop computer. We present our rMD computations in a key area of molecular-glue degraders aimed at a deeper understanding of the structural transition from open to closed conformations of CRBN.

5
Menger_Curvature : a MDAKit implementation to decipher the dynamics, curvatures and flexibilities of polymeric backbones at the residue level

Reboul, E.; Marien, J.; Prevost, C.; Taly, A.; Sacquin-Mora, S.

2025-04-10 bioinformatics 10.1101/2025.04.04.647214 medRxiv
Top 0.1%
4.8%

Characterizing the dynamics of the backbone of flexible polymers such as Intrinsically Disordered Regions and Proteins (IDRs and IDPs) has proven to be a significant challenge in molecular dynamics (MD) simulations due to the high conformational variability. The widely used mobility metric Root-Mean-Squared Fluctuations (RMSF) cannot provide this information, as defining a relevant reference structure is often not possible. We previously introduced a new flexibility metric to remedy this gap: the Local Flexibilities (LFs), derived, alongside the Local Curvatures (LCs), from the Proteic Menger Curvatures (PMCs). Here we present a Numba-accelerated implementation for any polymer of the calculation of Menger curvatures as an MDAKit for the widely used MDAnalysis package. We perform a benchmark with the RMSF and another flexibility metric derived from Proteic Blocks (PBs), the Equivalent Number of PBs (Neq), and show that the PMCs are an order of magnitude faster to compute on a modern CPU chip. We applied all 3 flexibility metrics to a βIII-tubulin monomer as an example, as tubulins are known to possess the entire range of proteic elements, from α-helices and β-sheets to a flexible loop and a disordered C-terminal tail (CTT). RMSF, LFs and Neq all succeed in identifying the flexible loops and the CTT, although the RMSF requires a system-specific alignment to do so. We believe that Menger curvatures will prove to be a valuable metric to study protein dynamics and polymers in general. The MDAKit package Menger_Curvature is readily available at https://github.com/EtienneReboul/menger_curvature
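For readers unfamiliar with the quantity, the Menger curvature of three points is the reciprocal of the radius of the circle through them, equal to four times the triangle area divided by the product of the side lengths. The sketch below is a plain-Python illustration of that formula along a chain. It does not use the Menger_Curvature package or its API, and the helper names and the `spacing` parameter (how far apart the three backbone points are taken) are assumptions.

```python
import math

def menger_curvature(a, b, c):
    """Menger curvature of three 3D points: 4 * area / (|ab| * |bc| * |ca|)."""
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    # The norm of the cross product u x v is twice the triangle area.
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    twice_area = math.sqrt(sum(x * x for x in cross))
    denom = ab * bc * ca
    return 0.0 if denom == 0.0 else 2.0 * twice_area / denom

def chain_curvatures(coords, spacing=2):
    """Curvature at each interior point i, using points i-spacing, i, i+spacing."""
    return [menger_curvature(coords[i - spacing], coords[i], coords[i + spacing])
            for i in range(spacing, len(coords) - spacing)]

# Points on a circle of radius 2: every curvature should be close to 1/2.
circle = [(2.0 * math.cos(0.3 * k), 2.0 * math.sin(0.3 * k), 0.0) for k in range(10)]
print(chain_curvatures(circle))
```

Collecting these values per residue over the frames of a trajectory would give the raw series from which curvature and flexibility statistics could be derived.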

6
Linearised loop kinematics to study pathways between conformations

Hoevenaars, A. G. L.; Andre, I.

2021-04-11 bioinformatics 10.1101/2021.04.11.439310 medRxiv
Top 0.1%
4.8%

Conformational changes are central to the function of many proteins. Characterization of these changes using molecular simulation requires methods to effectively sample pathways between protein conformational states. In this paper we present an iterative algorithm that samples conformational transitions in protein loops, referred to as the Jacobian-based Loop Transition (JaLT) algorithm. The method uses internal coordinates to minimise the sampling space, while Cartesian coordinates are used to maintain loop closure. Information from the two representations is combined to push sampling towards a desired target conformation. The innovation that enables the simultaneous use of Cartesian coordinates and internal coordinates is the linearisation of the inverse kinematics of a protein backbone. The algorithm uses the Rosetta all-atom energy function to steer sampling through low-energy regions and uses Rosetta's side-chain energy minimiser to update side-chain conformations along the way. Because the JaLT algorithm combines a detailed energy function with a low-dimensional conformational space, it is positioned in between molecular dynamics (MD) and elastic network model (ENM) methods. As a proof of principle, we apply the JaLT algorithm to study the conformational transition between the open and occluded state in the MET20 loop of the Escherichia coli dihydrofolate reductase enzyme. Our results show that the algorithm generates semi-continuous pathways between the two states with realistic energy profiles. These pathways can be used to identify energy barriers along the transition. The effect of a single point mutation of the MET20 loop was also investigated and the predicted increase in energy barrier is consistent with the experimentally observed reduction in catalytic rate of the enzyme. Additionally, it is demonstrated how the JaLT algorithm can be used to identify dominant degrees of freedom during a transition. This can be valuable input for a more extensive characterization of the free energy pathway along a transition using molecular dynamics, which is often performed with a reduced set of degrees of freedom. This study has thereby provided the first examples of how linearisation of inverse kinematics can be applied to the analysis of proteins.

7
Accurate and efficient constrained molecular dynamics of polymers through Newton's method and special purpose code

Lopez-Villellas, L.; Kjelgaard Mikkelsen, C. C.; Galano-Frutos, J. J.; Marco-Sola, S.; Alastruey-Benede, J.; Ibanez, P.; Moreto, M.; Sancho, J.; Garcia-Risueno, P.

2022-09-28 molecular biology 10.1101/2022.09.28.509839 medRxiv
Top 0.1%
4.5%

In molecular dynamics simulations we can often increase the time step by imposing constraints on internal degrees of freedom, such as bond lengths and bond angles. This allows us to extend the length of the time interval and therefore the range of physical phenomena that we can afford to simulate. In this article we analyse the impact of the accuracy of the constraint solver. We present ILVES-PC, an algorithm for imposing constraints on proteins accurately and efficiently. ILVES-PC solves the same system of differential algebraic equations as the celebrated SHAKE algorithm, but uses Newton's method for solving the nonlinear constraint equations. It solves the necessary linear systems of equations using a specialised linear solver that utilises the molecular structure. ILVES-PC can rapidly solve the nonlinear constraint equations to nearly the limit of machine precision. This eliminates the spurious forces introduced to simulations through the very common use of inaccurate approximations. The run-time of ILVES-PC is proportional to the number of constraints. We have integrated ILVES-PC into GROMACS and simulated proteins of different sizes. Compared with SHAKE, we have achieved speedups of up to 4.9x in single-threaded executions and up to 76x in shared-memory multi-threaded executions. Moreover, we find that ILVES-PC is more accurate than the P-LINCS algorithm. Our work is a proof-of-concept of the utility of software designed specifically for the simulation of polymers.
Author summary: Molecular dynamics simulates the time evolution of molecular systems. It has become a tool of extraordinary importance for e.g. understanding biological processes and designing drugs and catalysts. This article presents an algorithm for computing the forces needed to impose constraints in molecular dynamics, i.e., the constraint forces; moreover, it analyses the effect of the accuracy of the constraint solver. Presently, it is customary to calculate the constraint forces with a relative error that is not tiny. This is due to the high computational cost associated with the available software. Accurate calculations are possible, but they are very time-consuming. The algorithm that we present solves this problem: it computes the constraint forces accurately and efficiently. Our work will improve the accuracy and reliability of molecular dynamics simulations beyond the present state-of-the-art. The results that we present are also a proof-of-concept that special-purpose code can increase the performance of software for the simulation of polymers. The algorithm is implemented into a popular molecular simulation package, and is now available for the research community.
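To make the idea of solving a constraint equation with Newton's method concrete, here is a toy sketch for a single bond-length constraint between two particles. It is not ILVES-PC and does not reflect its linear solver or its GROMACS integration: the function name, arguments, and test geometry are hypothetical. With only one constraint there is no coupled linear system, so the sketch shows only the scalar Newton iteration on the Lagrange multiplier.

```python
import math

def solve_bond_constraint(r1_old, r2_old, r1_unc, r2_unc, m1, m2, d,
                          tol=1e-14, max_iter=20):
    """Return constrained positions and the multiplier for one bond of length d.

    r*_old: positions at the start of the step (they fix the correction direction).
    r*_unc: positions after an unconstrained integration step.
    """
    w = 1.0 / m1 + 1.0 / m2
    r = [a - b for a, b in zip(r1_old, r2_old)]   # constraint gradient direction
    s = [a - b for a, b in zip(r1_unc, r2_unc)]   # unconstrained bond vector
    lam = 0.0
    for _ in range(max_iter):
        cur = [si + lam * w * ri for si, ri in zip(s, r)]
        sigma = sum(c * c for c in cur) - d * d    # constraint violation
        if abs(sigma) < tol:
            break
        dsigma = 2.0 * w * sum(ri * c for ri, c in zip(r, cur))
        lam -= sigma / dsigma                      # Newton update on the multiplier
    r1 = [x + lam / m1 * ri for x, ri in zip(r1_unc, r)]
    r2 = [x - lam / m2 * ri for x, ri in zip(r2_unc, r)]
    return r1, r2, lam

# Hypothetical step: a unit-length bond stretched by the unconstrained update.
p1, p2, lam = solve_bond_constraint(
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
    (-0.05, 0.02, 0.0), (1.10, -0.03, 0.0),
    m1=1.0, m2=1.0, d=1.0)
print(math.dist(p1, p2), lam)
```

The corrections are applied along the old bond vector with mass weighting, so the centre of mass of the pair is left unchanged while the bond length is restored.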

8
STORMM: Structure and TOpology Replica Molecular Mechanics for chemical simulations

Cerutti, D. S.; Boothroyd, S.; Wiewiora, R.; Sherman, W.

2024-03-28 biophysics 10.1101/2024.03.27.587048 medRxiv
Top 0.1%
4.0%

The Structure and TOpology Replica Molecular Mechanics (STORMM) code is a next-generation molecular simulation engine and associated libraries optimized for performance on fast, multicore central processor units (CPUs) and graphics processing units (GPUs) with independent memory and tens of thousands of threads. STORMM is built to run thousands of independent molecular mechanical calculations on a single GPU with novel implementations that optimize numerical precision, mathematical operations, throughput, and resource management. The libraries are built around accessible classes with detailed documentation, supporting fine-grained parallelism and algorithm development as well as macroscopic manipulations of groups of systems on and off of the GPU. A primary intention of the STORMM libraries is to provide developers of atomic simulation methods with access to a high-performance molecular mechanics engine with extensive facilities to prototype and develop bespoke tools aimed toward drug discovery applications. In its present state, STORMM delivers molecular dynamics simulations of small molecules and small proteins in implicit solvent with tens to hundreds of times the throughput of conventional codes. The engineering paradigm also transforms two of the most memory bandwidth-intensive aspects of condensed-phase dynamics, particle-mesh mapping and valence interactions, into compute-bound problems for several times the scalability of existing programs. Numerical methods for getting the most out of each bit of information present in stored coordinates and lookup tables are also presented, delivering improved accuracy over methods implemented in other molecular dynamics engines. The open-source code is released under the MIT license.

9
Accelerating Prediction of Chemical Shift of Protein Structures on GPUs

Wright, E.; Ferrato, M.; Bryer, A.; Searles, R.; Perilla, J. R.; Chandrasekaran, S.

2020-01-14 biophysics 10.1101/2020.01.12.903468 medRxiv
Top 0.1%
4.0%

Experimental chemical shifts (CS) from solution and solid state magic-angle-spinning nuclear magnetic resonance spectra provide atomic level information for each amino acid within a protein or protein complex. However, structure determination of large complexes and assemblies based on NMR data alone remains challenging due to the complexity of the calculations. Here, we present a hardware accelerated strategy for the estimation of NMR chemical-shifts of large macromolecular complexes based on the previously published PPM_One software. The original code was not viable for computing large complexes, with our largest dataset taking approximately 14 hours to complete. Our results show that the code refactoring and acceleration brought down the time taken of the software running on an NVIDIA V100 GPU to 46.71 seconds for our largest dataset of 11.3M atoms. We use OpenACC, a directive-based programming model for porting the application to a heterogeneous system consisting of x86 processors and NVIDIA GPUs. Finally, we demonstrate the feasibility of our approach in systems of increasing complexity ranging from 100K to 11.3M atoms.

10
Complete reconstruction of the unbinding pathway of an anticancer drug by conventional unbiased molecular dynamics simulation

Sohraby, F.; Javaheri Moghadam, M.; Aliyar, M.; Aryapour, H.

2020-02-25 bioinformatics 10.1101/2020.02.23.961474 medRxiv
Top 0.1%
4.0%

Understanding the details of the unbinding mechanism of small-molecule drugs is an inseparable part of rational drug design. Today, the unbinding pathway of small-molecule drugs can be reconstructed through molecular dynamics simulations. Nonetheless, simulating a process in which a drug unbinds from its receptor demands a great deal of time, mostly up to several milliseconds. This amount of time is neither reasonable nor affordable; therefore, many researchers apply various biases whose trustworthiness remains in doubt. In this work we have used short-run simulations (replicas) to make this time-consuming process cost-effective. By replicating those snapshots of the trajectories which, after careful analysis, were selected as potential candidates, we increased our system's efficiency considerably. In effect, we have implemented a form of human bias, inspecting trajectories visually, to achieve multiple unbinding events. We call this stratagem of replicating potent snapshots "rational sampling", as it benefits from human logic. In our case, an anticancer drug, dasatinib, completely unbound from its target protein, c-Src kinase, in only 392.6 ns, and this was achieved without applying any internal biases or potentials, which can increase the error level. Thus, we obtained important structural details that can alter our viewpoint as well as assist drug designers.

11
LPATH: A semi-automated Python tool for clustering molecular pathways

Bogetti, A.; Leung, J. M.; Chong, L.

2023-08-20 biophysics 10.1101/2023.08.17.553774 medRxiv
Top 0.1%
3.9%

The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here we present the LPATH Python tool, which implements a semi-automated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
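Steps 2 and 3 of the method described above (state strings and pairwise string matching) can be sketched with the Python standard library. This is an illustration only and does not use LPATH or its API. The state labels, the helper names, the choice to collapse consecutive repeats, and the use of `difflib.SequenceMatcher` as the similarity score are all assumptions.

```python
from difflib import SequenceMatcher
from itertools import groupby

def condense(path):
    """Collapse consecutive repeats so dwell time in a state is ignored (assumed choice)."""
    return "".join(state for state, _ in groupby(path))

def path_distance(p, q):
    """1 - similarity ratio between the condensed state strings of two pathways."""
    matcher = SequenceMatcher(None, condense(p), condense(q), autojunk=False)
    return 1.0 - matcher.ratio()

def distance_matrix(paths):
    """Symmetric pairwise distance matrix, suitable as input to a clustering step."""
    n = len(paths)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = path_distance(paths[i], paths[j])
    return dist

# Hypothetical pathways: each character is one visited key state per frame.
pathways = ["AAABBBC", "ABBCCC", "AAADDDC", "ADC"]
for row in distance_matrix(pathways):
    print([round(x, 3) for x in row])
```

In this toy input the first two pathways visit the same states in the same order and so have distance zero, as do the last two. A hierarchical clustering of the matrix would then separate the two routes.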

12
Modeling stereospecific drug interactions with beta-adrenergic receptors

Dawson, J. R. D.; DeMarco, K. R.; Han, Y.; Bekker, S.; Clancy, C. E.; Yarov-Yarovoy, V.; Vorobyov, I.

2023-10-01 molecular biology 10.1101/2023.10.01.560334 medRxiv
Top 0.1%
3.8%

Beta adrenergic receptors (βARs) are G protein-coupled receptors that control processes as varied as heart rhythm and vascular tone by binding agonists such as norepinephrine to induce downstream signaling pathways. Beta blockers antagonize βARs to downregulate their activity, thus reducing heart rate and lowering vascular tone. We developed a new Rosetta structural modeling protocol to develop state-specific models of β1AR, expressed in cardiac myocytes, as well as β2AR, expressed in the smooth muscle cells of vasculature and other tissues, and their atomistic-scale interactions with beta-blockers using RosettaLigand. We identified structural features of drug-receptor interactions, which may account for their receptor conformational state and drug stereospecific preferences. Furthermore, we estimated structural stabilities of our models using atomistic molecular dynamics (MD) simulations. In our recent study we validated our structural models of norepinephrine-bound β2AR and its complex with stimulatory G protein via multi-microsecond MD simulations. Thus, here we mostly focused on state-dependent and stereospecific β1AR interactions with the beta-blocking drugs sotalol and propranolol. We observed expected inactive receptor state preferences and structural stabilities of our models in MD simulations, but neither those simulations nor RosettaLigand docking could clearly distinguish stereospecific preferences of those drugs. This warrants consideration of alternative hypotheses and enhanced sampling MD simulations, which we discuss as well. Nevertheless, our study provides a basis for understanding conformational state selectivity and stereospecificity of beta-blockers for βARs, important pharmacological targets, and may be extended to other drug classes and receptor types.
Graphical abstract: Norepinephrine (NE) bound active-state beta-1 adrenergic receptor (β1AR) in complex with the stimulatory G protein (Gs) heterotrimer embedded in a lipid bilayer. When expressed at the plasma membrane, the β1AR is oriented such that the ligand binding pocket (*) is accessible to ligands from the extracellular side (Ex.) of the membrane. The Gαs (red), Gβ (blue), and Gγ (yellow) subunits comprise the Gs heterotrimer. Nucleotides GDP or GTP bind Gα at the P-loop (**). Inset: Representative image of NE bound within the orthosteric ligand binding pocket.

13
Brewing COFFEE: a sequence-specific coarse-grained energy function for simulations of DNA-protein complexes

Chakraborty, D.; Mondal, B.; Thirumalai, D.

2023-06-08 biophysics 10.1101/2023.06.07.544064 medRxiv
Top 0.1%
3.7%

DNA-protein interactions are pervasive in a number of biophysical processes ranging from transcription and gene expression to chromosome folding. To describe the structural and dynamic properties underlying these processes accurately, it is important to create transferable computational models. Toward this end, we introduce Coarse grained force field for energy estimation, COFFEE, a robust framework for simulating DNA-protein complexes. To brew COFFEE, we integrated the energy function in the Self-Organized Polymer model with Side Chains for proteins and the Three Interaction Site model for DNA in a modular fashion, without re-calibrating any of the parameters in the original force-fields. A unique feature of COFFEE is that it describes sequence-specific DNA-protein interactions using a statistical potential (SP) derived from a dataset of high-resolution crystal structures. The only parameter in COFFEE is the strength (λ_DNAPRO) of the DNA-protein contact potential. For an optimal choice of λ_DNAPRO, the crystallographic B-factors for DNA-protein complexes, with varying sizes and topologies, are quantitatively reproduced. Without any further readjustments to the force-field parameters, COFFEE predicts the scattering profiles that are in quantitative agreement with SAXS experiments as well as chemical shifts that are consistent with NMR. We also show that COFFEE accurately describes the salt-induced unraveling of nucleosomes. Strikingly, our nucleosome simulations explain the destabilization effect of ARG to LYS mutations, which do not alter the balance of electrostatic interactions but affect chemical interactions in subtle ways. The range of applications attests to the transferability of COFFEE, and we anticipate that it would be a promising framework for simulating DNA-protein complexes at the molecular length-scale.

14
From sequence to Boltzmann weighted ensemble of structures with AlphaFold2-RAVE

Vani, B. P.; Aranganathan, A.; Wang, D.; Tiwary, P.

2022-05-26 biophysics 10.1101/2022.05.25.493365 medRxiv
Top 0.1%
3.7%

While AlphaFold2 is rapidly being adopted as a new standard in protein structure predictions, it is limited to single structure prediction. This can be insufficient for the inherently dynamic world of biomolecules. Even with recent modifications towards conformational diversity, AlphaFold2 does not provide thermodynamically ranked conformations. AlphaFold2-RAVE is an efficient protocol using the structural outputs from AlphaFold2 as initializations for AI augmented molecular dynamics. These simulations result in Boltzmann ranked ensembles, which we demonstrate on different proteins.

15
Exploring RNA conformational ensembles in silico: progress and challenges

Roeder, K.; Stirnemann, G.; Meuret, L.; Barquero-Morera, D.; Forget, S.; Wales, D. J.; Pasquali, S.

2026-02-18 molecular biology 10.64898/2026.02.18.706514 medRxiv
Top 0.1%
3.6%

RNA function is intrinsically linked to its structural polymorphism, with molecules exploring the heterogeneous conformational ensembles resulting from complex energy landscapes. These landscapes arise from competing interactions, small energetic separations between microstates, and strong coupling to the environment, posing significant challenges for both experimental characterization and molecular simulation. In this chapter, we review current computational strategies that aim to explore RNA conformational ensembles in silico, with a specific focus on energy landscape-based approaches and atomistic simulations. We discuss key limitations related to sampling efficiency, force-field accuracy, and ensemble analysis, and illustrate their impact through case studies on a self-cleaving ribozyme and an H-type pseudoknot. Finally, we highlight emerging directions, including closer integration with experimental data and the growing role of machine learning, which will probably reinforce the predictive power of in silico RNA energy landscape exploration.

16
Predicting unknown binding sites for transition metal based compounds in proteins

Levy, A.; Rothlisberger, U.

2026-02-03 bioinformatics 10.64898/2026.01.29.702545 medRxiv
Top 0.1%
3.6%

Transition metal based compounds are promising therapeutic agents, particularly in cancer treatment. However, predicting their binding sites remains a major challenge. In this work, we investigate the applicability of two tools, Metal3D and Metal1D, for this purpose. Although originally trained to predict zinc ion binding sites only, both predictors successfully identify several experimentally observed binding sites for transition metal complexes directly from apo protein structures. At the same time, we highlight current limitations, such as the sensitivity to side-chain conformations, and discuss possible strategies for improvement. This work provides a first step toward establishing a robust computational pipeline in which rapid and low-cost predictors are able to identify putative hotspots for transition metal binding, which can then be refined using more accurate but computationally demanding methods.
Author summary: Transition metals play a crucial role as therapeutic agents, especially in cancer therapy. However, the prediction of their binding site locations is challenging, as accurate computational methods often require time-consuming simulations, making them impractical when many possible binding sites must be explored. In this work, we explored the capability of two binding site predictors, originally developed to locate metal ions in proteins, to identify binding sites for more complex covalently-bound transition metal based agents. We found that these tools can often identify the experimentally-known binding regions, even when starting from the apo structure, in which the protein does not already contain the metal compound. At the same time, our results show clear limitations in more challenging cases, particularly when the binding involves only a single amino acid or when the binding site undergoes major structural rearrangements upon binding. Overall, our study shows that fast predictors can provide valuable early insights in the investigation of the binding sites of covalently-bound transition metal based compounds. When combined with more accurate simulation techniques, they can help focus computational efforts and ultimately support the rational design of transition metal based drugs.

17
Designing Peptides on a Quantum Computer

Mulligan, V. K.; Melo, H.; Merritt, H. I.; Slocum, S.; Weitzner, B. D.; Watkins, A. M.; Renfrew, P. D.; Pelissier, C.; Arora, P. S.; Bonneau, R.

2019-09-02 bioengineering 10.1101/752485 medRxiv
Top 0.1%
3.6%

Although a wide variety of quantum computers are currently being developed, actual computational results have been largely restricted to contrived, artificial tasks. Finding ways to apply quantum computers to useful, real-world computational tasks remains an active research area. Here we describe our mapping of the protein design problem to the D-Wave quantum annealer. We present a system whereby Rosetta, a state-of-the-art protein design software suite, interfaces with the D-Wave quantum processing unit to find amino acid side chain identities and conformations to stabilize a fixed protein backbone. Our approach, which we call the QPacker, uses a large side-chain rotamer library and the full Rosetta energy function, and in no way reduces the design task to a simpler format. We demonstrate that quantum annealer-based design can be applied to complex real-world design tasks, producing designed molecules comparable to those produced by widely adopted classical design approaches. We also show through large-scale classical folding simulations that the results produced on the quantum annealer can inform wet-lab experiments. For design tasks that scale exponentially on classical computers, the QPacker achieves nearly constant runtime performance over the range of problem sizes that could be tested. We anticipate better than classical performance scaling as quantum computers mature.

18
Bayesian Maximum Entropy Ensemble Refinement

Eltzner, B.; Hofstadler, J.; Rudolf, D.; Habeck, M.; de Groot, B.

2023-09-15 bioinformatics 10.1101/2023.09.12.557310 medRxiv
Top 0.1%
3.6%

The principle of maximum entropy provides a canonical way to incorporate measurement results into a thermodynamic ensemble. Observable features of a thermodynamic system, which are measured as averages over an ensemble, are included in the partition function by using Lagrange multipliers. Applying this principle to the system's energy leads to the well-known exponential form of the Boltzmann probability density. Here, we present a Bayesian approach to the estimation of maximum entropy parameters from nuclear Overhauser effect measurements in order to achieve a refined ensemble in molecular dynamics simulations. To achieve this goal, we leverage advances in the treatment of doubly intractable Bayesian inference problems by adaptive Markov Chain Monte Carlo methods. We illustrate the properties and viability of our method for alanine dipeptide as a simple model system and trp-cage as an example of a more complex peptide.
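
The maximum entropy construction mentioned in this abstract can be written out in its standard textbook form. The block below is only that generic derivation, not the authors' Bayesian estimation procedure; the symbols p, f_i, F_i, \lambda_i and Z are notation assumed here for illustration.

```latex
\begin{aligned}
&\text{maximize}\quad S[p] = -\int p(x)\,\ln p(x)\,dx
\quad\text{subject to}\quad \int p(x)\,dx = 1,
\qquad \int p(x)\,f_i(x)\,dx = F_i \\[4pt]
&\Longrightarrow\quad
p(x) = \frac{1}{Z(\lambda)}\exp\Big(-\sum_i \lambda_i f_i(x)\Big),
\qquad
Z(\lambda) = \int \exp\Big(-\sum_i \lambda_i f_i(x)\Big)\,dx \\[4pt]
&\text{single constraint } f(x) = E(x),\ \lambda = \beta = \frac{1}{k_B T}:
\qquad
p(x) = \frac{e^{-\beta E(x)}}{Z(\beta)}
\end{aligned}
```

Each Lagrange multiplier \lambda_i is fixed by requiring that the ensemble average of f_i matches the measured value F_i; the last line is the special case that recovers the Boltzmann density.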

19
Benchmarking GROMACS on Optimized Colab Processors and the Flexibility of Cloud Computing for Molecular Dynamics

Karagöl, T.; Karagöl, A.

2024-11-15 bioinformatics 10.1101/2024.11.14.623563 medRxiv
Top 0.1%
3.6%

Molecular dynamics (MD) simulations are widely used computational tools in the chemical and biological sciences. For these simulations, GROMACS is a popular open-source option among molecular dynamics packages designed for biochemical molecules. In addition to software, these simulations have traditionally relied on costly infrastructure such as supercomputers or clusters for High-Performance Computing (HPC). In recent years, there has been a significant shift towards using the computing resources of commercial cloud providers. This shift is driven by the flexibility and accessibility these platforms offer, irrespective of an organization's financial capacity. Many commercial compute platforms, such as Google Compute Engine (GCE) and Amazon Web Services (AWS), provide scalable computing infrastructure. An alternative to these platforms is Google Colab, a cloud-based platform that provides a convenient computing solution by offering GPU and TPU resources that can be utilized for scientific computing. The accessibility of Colab makes it easier for a wider audience to conduct computational tasks without needing specialized hardware or otherwise costly infrastructure. However, running GROMACS on Colab also comes with limitations. Google Colab imposes usage restrictions, such as time limits for continuous sessions, capped at several hours, and limits on the availability of high-performance GPUs. Users may also face disruptions due to session timeouts or hardware availability constraints, which can be challenging for large or long-running molecular simulations. By re-compiling GROMACS on Google Colab, we significantly improved its performance relative to the default pre-compiled version. We also present a method for integrating Google Drive to save and resume interrupted simulations, ensuring that users can secure files after session timeouts.
Additionally, we detail the setup and utilization of the CUDA and MPI environment in Colab to enhance GROMACS performance. Finally, we compare the efficiency of CUDA-enabled GPUs with Google's TPUv2 units, highlighting the trade-offs of each platform for molecular dynamics simulations. This work equips researchers, students, and educators with practical MD tools while providing insights to optimize their simulations within the Colab environment.
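
The save-and-resume idea described in this abstract could look roughly like the sketch below. It is not the authors' notebook: the helper name `build_mdrun_command`, the run directory layout, and the default file prefix `md` are assumptions made for illustration. It relies only on the standard `gmx mdrun` options `-deffnm`, `-cpt` (checkpoint interval in minutes), and `-cpi` (checkpoint file to continue from), and assumes the run directory lives on a mounted Google Drive folder so that files survive a session timeout.

```python
from pathlib import Path
from typing import List


def build_mdrun_command(run_dir: Path, deffnm: str = "md",
                        checkpoint_minutes: int = 15) -> List[str]:
    """Build a `gmx mdrun` command that resumes from a checkpoint if one exists.

    Hypothetical helper: run_dir is assumed to be a persistent folder
    (for example under /content/drive/MyDrive after mounting Drive in Colab).
    """
    # Write a checkpoint every `checkpoint_minutes` minutes of wall time.
    cmd = ["gmx", "mdrun", "-deffnm", deffnm, "-cpt", str(checkpoint_minutes)]

    # With -deffnm, mdrun writes its checkpoint to <deffnm>.cpt.
    checkpoint = run_dir / f"{deffnm}.cpt"
    if checkpoint.exists():
        # Continue the interrupted run from the saved state.
        cmd += ["-cpi", checkpoint.name]
    return cmd
```

In a Colab session, one would typically mount Drive first (`from google.colab import drive; drive.mount('/content/drive')`) and then launch the returned command with `subprocess.run(cmd, cwd=run_dir, check=True)`, re-running the same cell after each timeout.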

20
A method for detection of permeation events in Molecular Dynamics simulations of lipid bilayers

Camilo, C. R. d. S.; Ruggiero, J. R.; de Araujo, A. S.

2021-01-21 bioinformatics 10.1101/2021.01.20.427278 medRxiv
Top 0.1%
3.6%

The cell membrane is one of the most important structures of life. Understanding how it functions, and in particular how it controls the flow of substances between the cytoplasm and the environment, is essential in several fields of knowledge. Because it is a complex structure, composed of several classes of compounds such as lipids, proteins, and sugars, a convenient way to mimic it is through a phospholipid bilayer. The Molecular Dynamics simulation of lipid bilayers in solution is the main computational approach to model the cell membrane. In this work, we present a method to detect permeation events of molecules through the lipid bilayer, characterizing their crossing time and trajectory. By splitting the simulation box into well-defined regions, the method distinguishes the passage of molecules through the bilayer from artifacts produced by molecules crossing the simulation box edges when using periodic boundary conditions. We apply the method to study the spontaneous permeation of water molecules through bilayers with different lipid compositions and modeled with different force fields. Our method successfully characterizes the permeation events, and the results obtained show that the frequency and time of permeation are independent of the force field used to model the phospholipids. In addition, increasing the concentration of cholesterol molecules in lipid bilayers reduces the number of permeation events, because cholesterol compacts the bilayer, making it denser and therefore hindering the diffusion of water molecules inside it. The computational tool implementing the method discussed here is available at https://github.com/crobertocamilo/MD-permeation.
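
The region-splitting idea in this abstract can be illustrated with a minimal sketch. This is not the code from the linked repository: the function names, the three-region layout, and the inputs are assumptions. It takes the z coordinate of one molecule per frame (assumed already wrapped into the box, with a bilayer that does not drift) and counts a permeation only when the molecule enters the membrane slab from one water side and leaves to the other. A jump between the two water regions through the periodic box edge never visits the slab, so it is ignored.

```python
from typing import List, Sequence, Tuple


def classify(z: float, z_lower: float, z_upper: float) -> str:
    """Label a z coordinate as below, inside, or above the membrane slab."""
    if z < z_lower:
        return "below"
    if z > z_upper:
        return "above"
    return "membrane"


def detect_permeations(z_traj: Sequence[float], z_lower: float,
                       z_upper: float) -> List[Tuple[int, int]]:
    """Return (entry_frame, exit_frame) pairs for complete membrane crossings."""
    events: List[Tuple[int, int]] = []
    if not z_traj:
        return events

    prev = classify(z_traj[0], z_lower, z_upper)
    entry_side = None   # water region the molecule came from
    entry_frame = None  # frame at which it entered the slab

    for frame in range(1, len(z_traj)):
        region = classify(z_traj[frame], z_lower, z_upper)
        if region == "membrane" and prev != "membrane":
            # Molecule has just entered the slab from a water region.
            entry_side, entry_frame = prev, frame
        elif region != "membrane" and prev == "membrane":
            # Molecule has just left the slab; count it only if it
            # came out on the opposite side from where it entered.
            if entry_side is not None and region != entry_side:
                events.append((entry_frame, frame))
            entry_side, entry_frame = None, None
        prev = region
    return events


# Example: slab between z = 4 and z = 7 in a box of height 10.
# The molecule crosses the slab once, then wraps through the box edge.
events = detect_permeations([1, 2, 5, 6, 9, 9.5, 0.5, 1], 4.0, 7.0)
print(events)  # [(2, 4)]
```

The crossing time of each event is the frame difference multiplied by the trajectory's frame spacing. A molecule that starts inside the slab has no known entry side, so its first exit is deliberately not counted.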